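Course Information
Course Title
Computational Methods for Data Science (資料科學方法論)
Semester
111-2
Intended For
College of Electrical Engineering and Computer Science, Master's Degree Program in Data Science
Instructor
張明中
Course Number
Data5010
Course Identifier
946 U0100
Section

Credits
3.0
Full/Half Year
Half year
Required/Elective
Elective
Time
Wednesday, periods 6, 7, 8 (13:20~16:20)
Location
新504
Remarks
Restricted to master's students and above.
Enrollment limit: 30 students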
 
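Course Syllabus
To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.
Course Description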

Week 1 Introduction to Data Science and Matrix Algebra
Week 2 Data Collection: Survey sampling
Week 3 Data Collection: Factorial design and Space-filling design
Week 4 Data Analysis: Supervised Learning I -- Linear model and Generalized linear model
Week 5 Data Analysis: Supervised Learning II -- Nonparametric regression
Week 6 Data Analysis: Supervised Learning III -- Gaussian process regression
Week 7 Data Analysis: Supervised Learning IV -- Discriminant analysis
Week 8 Data Analysis: Supervised Learning V -- Support vector machine
Week 9 Midterm exam week
Week 10 Data Analysis: Supervised Learning VI -- Bagging, Random forests, Boosting
Week 11 Data Analysis: Supervised Learning VII -- Deep neural networks
Week 12 Data Analysis: Unsupervised Learning I -- Principal component analysis, Factor analysis, Canonical correlation analysis
Week 13 Data Analysis: Unsupervised Learning II -- Clustering methods
Week 14 Big Data Issue I (p>>n): Feature screening
Week 15 Big Data Issue II (n>>p): Subdata selection
Week 16 Final exam week
Week 17 Flexible teaching
Week 18 Flexible teaching

Course Objectives
The aim of this course is to introduce a variety of statistical and machine learning methods for data analysis. Three core techniques for data science (data collection, supervised learning, and unsupervised learning) are introduced in detail. Some recent developments in big and high-dimensional data are also covered. The software I will be using for the course is R (website: https://www.r-project.org/).
Course Requirements
Basic statistical concepts and theory, along with programming skills, are required. The course Data5004 Statistical Foundations of Data Science (I) is helpful for understanding the material in this course. Grades are based on homework assignments (20%), project presentations (40%), and paper presentations (40%). Students may use any software, not limited to R, for the programming in their project presentations.
Expected Weekly Study Hours After Class

Office Hours
By appointment. Note: mcchang@stat.sinica.edu.tw
Required Reading
 
References
Textbook:
1. James, G., Witten, D., Hastie, T., and Tibshirani, R. (2021), An Introduction to Statistical Learning: with Applications in R, 2nd Edition, Springer.
Reference books:
1. Johnson, R.A. and Wichern, D.W. (2007), Applied Multivariate Statistical Analysis, 6th Edition, Prentice Hall.
2. Hastie, T., Tibshirani, R., and Friedman, J. (2009), The Elements of Statistical Learning, 2nd Edition, Springer.
3. Fan, J., Li, R., Zhang, C.-H., and Zou, H. (2020), Statistical Foundations of Data Science, CRC Press.
4. Selected papers
Grading
(For reference only)

Course Schedule
Week
Date
Topic
No data